Extracting Translation Verb Frames
Abstract
We describe a method for extracting translation verb frames (parallel subcategorization frames) from a parallel dependency treebank. The extracted frames constitute an important part of the machine translation dictionary for a structural machine translation system. We evaluate our method independently, using a manually annotated test dataset, and conclude that the bottleneck of the method lies in the quality of the automatic word alignment of the training data.
Similar resources
LTG vs. ITG Coverage of Cross-Lingual Verb Frame Alternations
We show in an empirical study that not only did all cross-lingual alternations of verb frames across Chinese–English translations fall within the reordering capacity of Inversion Transduction Grammars, but more surprisingly, about 97% of the alternations were expressible by the far more restrictive Linear Transduction Grammars. Also, about 71% of the cross-lingual verb frame alternations turn o...
Extracting Idiomatic Hungarian Verb Frames
We describe a machine learning method for collecting idiomatic fixed stem verb frames. First, we collect frequent frame candidates from the output of a partial parser; second, we apply a certain idiomaticity metric to the list to get the most idiomatic frames. The extracted frames will be translated to English and used as a resource in a Hungarian-to-English machine translation system.
Collocational Clashes in the Persian Translations of Tuesdays with Morrie
This study aimed at finding features of collocational deviations in the translations of Tuesdays with Morrie. In this direction, categories of collocations and collocational clashes, as well as causes of collocational clashes were explored. The present work investigated five Persian translations of the novel. All the books were examined completely and all possible collocational clashes were...
Towards Automatic Extraction of Verb Frames
This article explores the possibilities of automatic extraction of both surface and valency frames of Czech verbs. First, it is clearly documented that the data from the Prague Dependency Treebank is not sufficient for collecting enough examples of verb frames to build a large scale lexicon. As a solution, an approach to pick nice examples of sentences from any texts is suggested and thoroughly des...
Automatic Extraction of Subcategorization Frames from Spoken Corpora
We built a system for automatically extracting subcategorization frames (SCFs) from corpora of spoken language. The acquisition system, based on the design proposed by Briscoe & Carroll (1997), consists of a statistical parser, an SCF extractor, an English lemmatizer, and an SCF evaluator. These four components are applied in sequence to retrieve SCFs associated with each verb predicate in the cor...